38 research outputs found

    Coresets Meet EDCS: Algorithms for Matching and Vertex Cover on Massive Graphs

    As massive graphs become more prevalent, there is a rapidly growing need for scalable algorithms that solve classical graph problems, such as maximum matching and minimum vertex cover, on large datasets. For massive inputs, several different computational models have been introduced, including the streaming model, the distributed communication model, and the massively parallel computation (MPC) model that is a common abstraction of MapReduce-style computation. In each model, algorithms are analyzed in terms of resources such as space used or rounds of communication needed, in addition to the more traditional approximation ratio. In this paper, we give a single unified approach that yields better approximation algorithms for matching and vertex cover in all these models. The highlights include:
    * The first one-pass, significantly-better-than-2-approximation for matching in random arrival streams that uses subquadratic space, namely a $(1.5+\epsilon)$-approximation streaming algorithm that uses $O(n^{1.5})$ space for constant $\epsilon > 0$.
    * The first 2-round, better-than-2-approximation for matching in the MPC model that uses subquadratic space per machine, namely a $(1.5+\epsilon)$-approximation algorithm with $O(\sqrt{mn} + n)$ memory per machine for constant $\epsilon > 0$.
    By building on our unified approach, we further develop parallel algorithms in the MPC model that give a $(1+\epsilon)$-approximation to matching and an $O(1)$-approximation to vertex cover in only $O(\log\log n)$ MPC rounds and $O(n/\mathrm{polylog}(n))$ memory per machine. These results settle multiple open questions posed in the recent paper of Czumaj et al. [STOC 2018].
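For context on the factor of 2 that these results improve on: the classical greedy maximal matching already gives a 2-approximation to maximum matching in one pass using O(n) space. The sketch below shows only that baseline; it is not the EDCS-based algorithm of the paper, and the function name and the edge-stream format (an iterable of vertex pairs) are assumptions made for illustration.

```python
def greedy_streaming_matching(edge_stream):
    """One-pass greedy maximal matching (baseline, not the paper's method).

    Keeps an edge only if both endpoints are still unmatched, so the
    memory used is proportional to the number of vertices.
    """
    matched = set()   # vertices already covered by the matching
    matching = []     # edges kept so far
    for u, v in edge_stream:
        if u != v and u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching


# Path a-b-c-d: the arrival order decides whether greedy finds 1 or 2 edges.
bad_order = greedy_streaming_matching([("b", "c"), ("a", "b"), ("c", "d")])
good_order = greedy_streaming_matching([("a", "b"), ("b", "c"), ("c", "d")])
```

On this path the optimum has two edges; with the first arrival order greedy keeps only `("b", "c")`, which is the worst-case factor of 2.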

    Multiplicative Bidding in Online Advertising

    In this paper, we initiate the study of the multiplicative bidding language adopted by major Internet search companies. In multiplicative bidding, the effective bid on a particular search auction is the product of a base bid and bid adjustments that are dependent on features of the search (for example, the geographic location of the user, or the platform on which the search is conducted). We consider the task faced by the advertiser when setting these bid adjustments, and establish a foundational optimization problem that captures the core difficulty of bidding under this language. We give matching algorithmic and approximation hardness results for this problem; these results are against an information-theoretic bound, and thus have implications on the power of the multiplicative bidding language itself. Inspired by empirical studies of search engine price data, we then codify the relevant restrictions of the problem, and give further algorithmic and hardness results. Our main technical contribution is an $O(\log n)$-approximation for the case of multiplicative prices and monotone values. We also provide empirical validations of our problem restrictions, and test our algorithms on real data against natural benchmarks. Our experiments show that they perform favorably compared with the baseline. Comment: 25 pages; accepted to EC'1
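The effective-bid rule described in this abstract can be sketched in a few lines. The function name, the dictionary layout for adjustments, and the feature names and values are hypothetical, chosen only for illustration; a feature value without a stated adjustment is assumed to contribute a factor of 1.

```python
from math import prod


def effective_bid(base_bid, adjustments, features):
    """Multiply the base bid by the adjustment of each feature of the search.

    adjustments: maps (feature_name, feature_value) to a multiplier
                 (hypothetical layout, not taken from the paper).
    features:    maps feature_name to the value observed for this search.
    """
    return base_bid * prod(
        adjustments.get((name, value), 1.0) for name, value in features.items()
    )


adjustments = {("location", "NYC"): 1.5, ("platform", "mobile"): 0.5}
bid = effective_bid(2.0, adjustments, {"location": "NYC", "platform": "mobile"})
print(bid)  # 2.0 * 1.5 * 0.5 = 1.5
```

Because one multiplier per feature value is shared across all searches, the advertiser cannot set the bid for every feature combination independently, which is the restriction the optimization problem in the abstract is about.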

    Improved Approximation Algorithms for PRIZE-COLLECTING STEINER TREE and TSP

    We study the prize-collecting versions of the Steiner tree, traveling salesman, and stroll (a.k.a. PATH-TSP) problems (PCST, PCTSP, and PCS, respectively): given a graph (V, E) with costs on each edge and a penalty (a.k.a. prize) on each node, the goal is to find a tree (for PCST), cycle (for PCTSP), or stroll (for PCS) that minimizes the sum of the edge costs in the tree/cycle/stroll and the penalties of the nodes not spanned by it. In addition to being a useful theoretical tool for helping to solve other optimization problems, PCST has been applied fruitfully by AT&T to the optimization of real-world telecommunications networks. The most recent improvements for the first two problems, giving a 2-approximation algorithm for each, appeared first in 1992. (A 2-approximation for PCS appeared in 2003.) The natural linear programming (LP) relaxation of PCST has an integrality gap of 2, which has been a barrier to further improvements for this problem. We present (2 − ε)-approximation algorithms for all three problems, connected by a unified technique for improving prize-collecting algorithms that allows us to circumvent the integrality gap barrier.
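The objective stated in this abstract (edge costs of the chosen structure plus penalties of the nodes it does not span) can be evaluated as in the sketch below. The function name, the optional `root` argument, and the toy instance are assumptions for illustration; the code only evaluates a candidate solution and does not check that the chosen edges form a tree, cycle, or stroll.

```python
def prize_collecting_cost(edge_cost, penalty, chosen_edges, root=None):
    """Edge costs of the chosen edges plus penalties of unspanned nodes.

    edge_cost:    maps an edge (u, v) to its cost.
    penalty:      maps every node to its penalty.
    chosen_edges: edges of the candidate solution (validity not checked).
    root:         hypothetical extra argument so that a solution consisting
                  of a single node with no edges can be expressed.
    """
    spanned = {v for edge in chosen_edges for v in edge}
    if root is not None:
        spanned.add(root)
    connection = sum(edge_cost[edge] for edge in chosen_edges)
    unpaid = sum(p for node, p in penalty.items() if node not in spanned)
    return connection + unpaid


edge_cost = {("a", "b"): 1, ("b", "c"): 4}
penalty = {"a": 0, "b": 2, "c": 3}

print(prize_collecting_cost(edge_cost, penalty, [("a", "b")]))              # 1 + 3 = 4
print(prize_collecting_cost(edge_cost, penalty, [("a", "b"), ("b", "c")]))  # 5 + 0 = 5
```

In this toy instance, leaving node `c` out and paying its penalty is cheaper than connecting it, which is the trade-off the prize-collecting objective captures.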

    Metric Clustering and MST with Strong and Weak Distance Oracles

    We study optimization problems in a metric space $(\mathcal{X},d)$ where we can compute distances in two ways: via a "strong" oracle that returns exact distances $d(x,y)$, and a "weak" oracle that returns distances $\tilde{d}(x,y)$ which may be arbitrarily corrupted with some probability. This model captures the increasingly common trade-off between employing both an expensive similarity model (e.g. a large-scale embedding model), and a less accurate but cheaper model. Hence, the goal is to make as few queries to the strong oracle as possible. We consider both so-called "point queries", where the strong oracle is queried on a set of points $S \subset \mathcal{X}$ and returns $d(x,y)$ for all $x,y \in S$, and "edge queries", where it is queried for individual distances $d(x,y)$. Our main contributions are optimal algorithms and lower bounds for clustering and Minimum Spanning Tree (MST) in this model. For $k$-centers, $k$-median, and $k$-means, we give constant factor approximation algorithms with only $\tilde{O}(k)$ strong oracle point queries, and prove that $\Omega(k)$ queries are required for any bounded approximation. For edge queries, our upper and lower bounds are both $\tilde{\Theta}(k^2)$. Surprisingly, for the MST problem we give an $O(\sqrt{\log n})$ approximation algorithm using no strong oracle queries at all, and a matching $\Omega(\sqrt{\log n})$ lower bound. We empirically evaluate our algorithms, and show that their quality is comparable to that of the baseline algorithms that are given all true distances, while querying the strong oracle on only a small fraction ($<1\%$) of points.
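A minimal simulation of the two-oracle setting, using edge queries, might look like the sketch below. It rests on several assumptions not stated in the abstract: points lie on a line, each pair is corrupted independently with probability `corrupt_prob`, a corrupted answer is drawn uniformly at random as a stand-in for an arbitrary value, and repeated weak queries on the same pair return the same answer. The class and attribute names are hypothetical.

```python
import random


class DistanceOracles:
    """Toy strong/weak distance oracles over points on a line (illustrative only)."""

    def __init__(self, points, corrupt_prob, seed=0):
        self.points = points
        self.corrupt_prob = corrupt_prob
        self.rng = random.Random(seed)
        self.strong_queries = 0   # the resource an algorithm tries to minimize
        self._weak_cache = {}     # fixes the weak answer per pair (assumption)

    def _true_distance(self, i, j):
        return abs(self.points[i] - self.points[j])

    def strong(self, i, j):
        """Exact distance; every call is counted."""
        self.strong_queries += 1
        return self._true_distance(i, j)

    def weak(self, i, j):
        """Cheap distance that is wrong with probability corrupt_prob."""
        key = (min(i, j), max(i, j))
        if key not in self._weak_cache:
            answer = self._true_distance(i, j)
            if self.rng.random() < self.corrupt_prob:
                # Stand-in for an arbitrary corruption.
                answer = self.rng.uniform(0.0, 100.0)
            self._weak_cache[key] = answer
        return self._weak_cache[key]


oracles = DistanceOracles([0.0, 3.0, 10.0], corrupt_prob=0.2)
cheap = oracles.weak(0, 1)     # free, possibly corrupted
exact = oracles.strong(1, 2)   # exact, counted in strong_queries
```

An algorithm in this model would be judged by the quality of its output together with the final value of `strong_queries`.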